This paper focuses on the problem of script identification in unconstrainedscenarios. Script identification is an important prerequisite to recognition,and an indispensable condition for automatic text understanding systemsdesigned for multi-language environments. Although widely studied for documentimages and handwritten documents, it remains an almost unexplored territory forscene text images. We detail a novel method for script identification in natural images thatcombines convolutional features and the Naive-Bayes Nearest Neighborclassifier. The proposed framework efficiently exploits the discriminativepower of small stroke-parts, in a fine-grained classification framework. In addition, we propose a new public benchmark dataset for the evaluation ofjoint text detection and script identification in natural scenes. Experimentsdone in this new dataset demonstrate that the proposed method yields state ofthe art results, while it generalizes well to different datasets and variablenumber of scripts. The evidence provided shows that multi-lingual scene textrecognition in the wild is a viable proposition. Source code of the proposedmethod is made available online.
展开▼